Objectives. While the question of who is likely to be selected for clinical psychology
training has been studied, evidence on performance during training is scant. This 
study explored data from seven consecutive intakes of the UK's largest clinical 
psychology training course, aiming to identify what factors predict better or poorer 
outcomes. 

Design. Longitudinal cross-sectional study using prospective and retrospective data. 

Method. Characteristics at application were analysed in relation to a range of in-course 
assessments for 274 trainee clinical psychologists who had completed or were in the final 
stage of their training. 

Results. Trainees were diverse in age, pre-training experience, and academic performance
at A-level (advanced level certificate required for university admission), but not in
gender or ethnicity. Failure rates across the three performance domains (academic, 
clinical, research) were very low, suggesting that selection was successful in screening out 
less suitable candidates. Key predictors of good performance on the course were better 
A-levels and better degree class. Non-white students performed less well on two 
outcomes. The type and extent of pre-training clinical experience had varied
effects on outcomes. Research supervisor ratings emerged as global indicators and
predicted nearly all outcomes, but may have been biased as they were retrospective. 
Referee ratings predicted only one of the seven outcomes examined, and interview 
ratings predicted none of the outcomes. 

Conclusions. Predicting who will do well or poorly in clinical psychology training is 
complex. Interview and referee ratings may well be successful in screening out unsuitable 
candidates, but appear to be a poor guide to performance on the course. 
Practitioner points

• While referee and selection interview ratings did not predict performance during training, they may be 
useful in screening out unsuitable candidates at the application stage 

• High school final academic performance was the best predictor of good performance during clinical 
psychology training 

• The findings are derived from seven cohorts of one training course, the UK's largest; they cannot be 
assumed to generalize to all training courses 

Clinical psychology training in the United Kingdom is solely offered on a post-graduate 
basis, with a recognized degree in psychology as a prerequisite. It involves training in 
three areas over the course of three calendar years: academic teaching and study, clinical 
placements in the National Health Service (NHS), and research including completion of a 
doctoral thesis. Successful completion of all aspects leads to the award of a Doctorate in 
Clinical Psychology (DClinPsy or ClinPsyD). Training is underpinned by a
scientist-practitioner model similar to the Boulder model adopted in the United States more than
60 years ago (McFall, 2006). Training places for citizens of the United Kingdom or other 
countries in the European Economic Area are fully funded and salaried by the NHS. All 30 
training courses are accredited by the British Psychological Society, and also approved by 
the Health & Care Professions Council. 

Entry to clinical psychology training is highly competitive in the United Kingdom. All 
applications are administered by a national Clearing House (http://www.leeds.ac.uk/ 
chpccp), using a single application for up to four courses across the United Kingdom. The 
high ratio of applicants to training places (on average 3.8:1 across all UK courses during
the period covered by this study, data provided by the Clearing House), and the generally
high quality of applications, make selection resource intensive. In response, there has
been considerable interest over recent years in the selection process. Some courses have
introduced computerized or written tests as part of their short-listing process and work is 
underway to develop a national screening test. Despite their somewhat different selection 
criteria and processes, all courses shortlist on the application form submitted via the 
Clearing House and interview as part of the selection process. Although several studies 
have examined factors associated with application success (Phillips, Hatton, & Gray, 
2004; Scior, Gray, Halsey, & Roth, 2007), and selection procedures (Hemmings & 
Simpson, 2008; Simpson, Hemmings, Daiches, & Amor, 2010), the only UK-based study on 
prediction of performance during training demonstrated agreement between performance
on a written short-listing task and academic performance on the course
(Hemmings & Simpson, 2008), albeit with a small sample (N = 45). Several US studies 
have identified differences in the intakes and subsequent career paths associated with the 
three main clinical psychology training models used in the United States (clinical scientist, 
scientist-practitioner or practitioner-scholar; Cherry, Messenger, & Jacoby, 2000; McFall, 
2006). However, our searches did not reveal any English language studies other than 
Hemmings and Simpson's (2008) that examined factors associated with performance 
during training. Given the high resource costs involved in training clinical psychologists, 
and the substantial responsibility and power trainees have on qualification, it is surprising 
that evidence on predicting good or poor performance during clinical psychology training 
is sparse. A likely reason for this gap is the fact that attrition and failure are rare in clinical 
psychology training, requiring large samples to investigate. Furthermore, courses vary 
somewhat in assessment procedures, making it difficult to assess outcomes across training 
courses. 
Predictors of underperformance in medicine

It is useful to refer to the extensive literature on the predictors of dropout and academic 
underperformance at medical school. While medicine and clinical psychology have many 
differences, they also have important similarities: both require academic proficiency 
combined with communication and professional skills; both are highly selective; both 
qualify UK graduates for employment in the NHS; both have a strong professional identity. 

Medical admissions procedures attempt to select for academic and non-academic 
competence using a combination of school grades, aptitude test scores, personal or school 
statements, and interviews (Parry et al., 2006). As in clinical psychology, there is concern
in medicine about the demographic profile of the student population, especially in terms
of gender and socioeconomic background (Elston, 2009; Mathers, Sitch, Marsh, & Parry,
2011). The medical education literature therefore explores which factors predict medical
school outcomes. 



Pre-admission grades 

Pre-admission grades are consistent predictors of medical school performance (Ferguson, 
James, & Madeley, 2002). Lower school grades also predict dropout (O'Neill, Wallstedt, 
Eika, & Hartvigsen, 2011). However, the majority of medical applicants have top grades 
(McManus et al., 2005), leading to the use of aptitude tests in selection, although with
much debate about their usefulness (Emery, Bell, & Vidal Rodeiro, 2011; McManus et al.,
2005). Tests for selection into undergraduate medical courses in use in the United
Kingdom and Australia seem not to have good predictive validity (McManus, Ferguson,
Wakeford, Powis, & James, 2011; Mercer & Puddey, 2011; Wilkinson et al., 2008; Yates &
James, 2010), but MCAT, the test used in the United States where medicine is a graduate 
course like clinical psychology, appears to have reasonable predictive power (Donnon, 
Paolucci, & Violato, 2007). 

Interviews 

Traditional interviews have low predictive power (Goho & Blackman, 2006), and 
variations in interviewing methods make it hard to identify consistent relationships 
between interview characteristics and outcomes (Ferguson et al., 2002). Many medical
schools now use the multiple-mini interview in which students are assessed on how they
deal with professional situations in practice (Eva et al., 2009). This seems to predict
performance on later similar practical tests at medical school (Eva, Rosenfeld, Reiter, & 
Norman, 2004). 

Personal and academic references 

References are generally poor predictors (Ferguson, James, O'Hehir, Sanders, & 
McManus, 2003; Siu & Reiter, 2009), although negative comments from an academic 
referee may predict in-course difficulties (Yates & James, 2006), and a Canadian study 
showed personal statements and references to have a small predictive effect on medical 
school clinical performance (Peskun, Detsky, & Shandling, 2007). 

In the United Kingdom, medical school performance is also associated with 
demographics, with females and white students doing better, raising issues of equity 
(Ferguson et al., 2002; Higham & Steer, 2004; Woolf, McManus, Potts, & Dacre, 2013;
Woolf, Potts, & McManus, 2011).

Range restriction and other statistical challenges

Studies generally correlate selection variables (e.g., pre-admission grades, interview 
performance) with outcome measures. However, outcome measures are only available on 
those who were selected and not, naturally, on those who were not accepted. How 
selection variables predict performance within the selected group is not the same as 
assessing the variables as a means of selection (Burt, 1943). Because of range restriction, 
observed correlations between selection variables and outcomes are generally smaller 
than they would be were outcomes available on all applicants. It is possible to make a 
correction for range restriction, but only with the right data available on non-successful 
applicants (Sackett & Yang, 2000). 
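
To make the idea of a range restriction correction concrete, the sketch below applies the classic Thorndike Case II formula for direct restriction on the predictor. This is an illustration only and is not the procedure used in this study; the function name and the example values are hypothetical, and the formula assumes that the standard deviation of the selection variable is known for the full applicant pool.

```python
import math


def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Thorndike Case II correction for direct range restriction.

    r_restricted    : correlation observed in the selected group
    sd_unrestricted : SD of the selection variable among all applicants
    sd_restricted   : SD of the selection variable among those selected
    """
    u = sd_unrestricted / sd_restricted
    return (r_restricted * u) / math.sqrt(1 + r_restricted ** 2 * (u ** 2 - 1))


# Hypothetical example: an observed r of .20 where the applicant pool has
# twice the spread of the selected group on the selection variable.
corrected = correct_for_range_restriction(0.20, sd_unrestricted=100.0, sd_restricted=50.0)
print(round(corrected, 3))
```

When the two standard deviations are equal the function returns the observed correlation unchanged, which is the expected behaviour when no restriction has occurred.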

Other issues in the data - like the reliability of measures, ceiling effects, and ordinal 
outcome data - may also have the effect of reducing the observed correlations. Because of 
all these factors, the construct-level predictive validity can be much higher than observed 
correlations between selection variables and course outcomes (McManus et al., 2013).

This study 

The clinical psychology doctorate course investigated is the largest one in the United 
Kingdom, with an average applicant to place ratio of around 28:1 for its 40-42 training 
places. The course employs a three-stage selection procedure involving course staff and 
local clinical psychology supervisors. Written guidelines for selectors aim for maximum 
fairness in selection. The course previously found that successful applications were 
predicted by A-level (academic qualification offered by educational institutions in 
England, Wales, and Northern Ireland to students completing high school education) 
points (see Methods) and academic and clinical referee ratings (Scior et al., 2007),
although selectors may rely particularly on these in the absence of other clear ways of 
distinguishing among hundreds of applicants with good honours degrees. 

The aim of this study was to investigate the role of applicant characteristics, interview 
ratings, and referee ratings in predicting course performance in three domains: academic, 
clinical, and research. In doing so, we aimed to inform future selection procedures by 
identifying the predictive power of information available to selectors. 

Methods 
Participants 

Overall, 274 trainee clinical psychologists (the entire 2002-2008 entry cohorts) who had 
completed all aspects of their training by the time of the study, or were in the process of 
making revisions to their doctoral theses, were included in the study (as was one 
individual who had completed all aspects of training other than the thesis due to an 
extension). Over the seven cohorts studied, the annual intake increased from 30 to 42. 
Two trainees dropped out of training during the period studied, both within the first year 
of training, due to personal reasons. Due to the small numbers concerned, it was not 
feasible to examine what factors may influence dropout and these two were not included 
in the study. 

Application form data 

Information about demographics, educational, and employment histories was taken from 
each Clearing House application form. A-level points were calculated using the British 
Universities and Colleges Admissions Service tariff points system (grade A = 120 points;
B = 100; C = 80; D = 60; E = 40), with points for each A-level subject added to make up a
composite score. The degree class awarded for the first university degree was recorded, 
even when not psychology. For applicants with a 2.1 and a percentage mark, the degree 
was classified into low 2.1 <63.9%, mid 2.1 64-66.9%, or high 2.1 >67%. Whether the 
trainee had completed an MSc and/or PhD was recorded as binary data. The type of school 
attended was classified into state, grammar, or independent. The type of university (for 
first degree) was classified into Oxbridge, Russell Group, other pre-1992 universities, 
post-1992 universities, and non-UK universities. 
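
As an illustration of the A-level composite score, the tariff described above can be expressed as a simple lookup. This is a minimal sketch: the function name and the example grades are hypothetical and are not taken from the study's database, and grades outside A to E are not handled.

```python
# Tariff points per A-level grade, as stated in the text.
TARIFF_POINTS = {"A": 120, "B": 100, "C": 80, "D": 60, "E": 40}


def a_level_composite(grades):
    """Sum tariff points over all A-level subjects taken by one applicant."""
    return sum(TARIFF_POINTS[grade.upper()] for grade in grades)


# Hypothetical applicant with grades A, A and B.
print(a_level_composite(["A", "A", "B"]))  # 120 + 120 + 100 = 340
```

Under this tariff, four grade A passes give 480 points and four grade E passes give 160 points.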

A substantive voluntary post in the NHS and/or work as a Research Assistant was 
recorded separately. Overall, clinical experience was classified as minimal (<1 year or mix 
of short-term voluntary and paid part-time roles, or roles not highly relevant to clinical 
psychology), moderate (at least 1 year of a role highly relevant to clinical psychology, for 
example, assistant psychologist, graduate mental health worker, low intensity Improving 
Access to Psychological Therapy (IAPT) programme worker, research assistant on a 
clinical project), or substantial (work at the 'moderate' level in a range of roles and 
services). Ratings were made by the first and last authors independently after rating 
several application forms together and discussing them to achieve consistency. 

Selection interview ratings 

We used ratings recorded from interviews and held in the database. Interview panels 
consisting of three interviewers, including at least one member of course staff and at least one
regional supervisor, rated each interviewee on a 10-point scale, with 10 denoting
outstanding performance and 0 exceptionally poor performance, and all interim
scale points having clear descriptors. From 2002 to 2006, academic (A) and clinical
(B) interview performance was rated; from 2007 onwards, a third category of overall 
personal suitability was added (C), referring to communication skills, interpersonal 
style, reflexivity and self-awareness, and readiness to train. While service users
advise the course on its selection procedures and interview questions, they are not 
directly involved in interviews, not least due to the practical challenges arising from 
holding large numbers of interviews, so it was not possible to obtain service user 
ratings. 

Outcome data 

These included contemporary outcomes, measured at the time of assessment, and 
retrospective ratings gathered for this study. 

Contemporary outcomes 

Academic performance. Case report scores and exam marks. Each trainee completed 
five case reports, marked as 'pass', 'minor revisions', 'stipulated revisions', 'major 
revisions', or 'fail'. An overall case report score was calculated as the number of case 
reports that were given stipulated or major revisions or failed. Trainees took two exams in 
year 1 and two in year 2, and scores were z-transformed to account for differences across 
cohorts. Although marks are analysed, trainees were only required to pass the exam to 
progress. 
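
The two academic outcome measures described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's actual procedure: it assumes that exam marks were standardized within each intake cohort using the sample standard deviation, and all function names and example data are hypothetical.

```python
from statistics import mean, stdev

# Case report outcomes that count towards the overall case report score.
POOR_OUTCOMES = {"stipulated revisions", "major revisions", "fail"}


def case_report_score(marks):
    """Number of a trainee's case reports given stipulated/major revisions or a fail."""
    return sum(1 for mark in marks if mark in POOR_OUTCOMES)


def z_scores_within_cohort(marks_by_cohort):
    """Standardize raw exam marks separately within each cohort (assumed approach)."""
    standardized = {}
    for cohort, marks in marks_by_cohort.items():
        mu = mean(marks)
        sd = stdev(marks)  # sample SD; an assumption, the text does not specify
        standardized[cohort] = [(m - mu) / sd for m in marks]
    return standardized


# Hypothetical trainee with five case reports.
reports = ["pass", "minor revisions", "stipulated revisions", "major revisions", "fail"]
print(case_report_score(reports))

# Hypothetical raw marks for two cohorts marked on different scales.
print(z_scores_within_cohort({2002: [50, 60, 70], 2003: [55, 65, 75]}))
```

Standardizing within cohort puts marks from different years on a common scale, so a trainee's z-score reflects standing relative to peers sitting the same paper.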
Clinical performance. Number of major concerns about performance on placement
reported to exam boards, and number of placement failures. 

Research ability. Clinical viva result with 'pass' and 'one-month corrections' combined 
into one category, and compared to 'three-month corrections', 'one-year corrections' or 
'fail'. 



Retrospective ratings 

Course tutors were asked (yes/no) whether they had global concerns about their former
tutees in three areas: interpersonal skills (A), robustness (B), and ability to think critically
(C).

Research supervisors retrospectively rated trainees' research performance using a 
5-point scale anchored with 0 = poor; 3 = average; 5 = outstanding. Supervisors were 
asked to consider six factors when making one overall rating: (1) the trainee's capacity to 
think scientifically; (2) research analysis skills; (3) quality of written work; (4) critical 
thinking skills; (5) organization and planning abilities; and (6) ability to work 
autonomously. 

Results 

Descriptive statistics 

Demographics 

The mean age of trainees at entry to the course was 27 years (range 21-51 years). 
Eighty-five per cent were female (n = 234), 15% were male (n = 40); 9% were from
black and minority ethnic (BME) backgrounds (n = 25). On these indicators, trainees 
were very similar to the equivalent training cohorts across the United Kingdom of whom 
85.3% were females and 8.9% were from BME backgrounds (Clearing House data). 
Application form data on nine trainees were missing. 

Prior qualifications and experience 

There was marked variation in trainees' A-level performance. The mean A-level composite
score was 353 (range 160-480). One hundred and thirty-one trainees had attended a state
school, 42 a grammar school and 85 an independent school; data were unobtainable for
16. Forty per cent had a first class degree (n = 109), 55% a 2.1 (n = 150), and 2% a 2.2 (n = 5). Those
with a 2.2 degree had subsequently shown strong performance in a further
undergraduate or post-graduate degree. Information about the type of degree could only
be ascertained for half of the trainees with a 2.1 (n = 75). Of these, 71% had a high 2.1
(n = 53), 16% a mid 2.1 (n = 12), and 13% a low 2.1 (n = 10). Thirty-five trainees had
attended Oxford or Cambridge Universities, 27 Russell Group Universities, 156 other 
pre-1992 universities, 35 post-1992 universities, and 11 non-UK universities (six of which
were in the Republic of Ireland); data were unobtainable for 10. Prior qualifications were
similar when comparing white and BME trainees. The two groups had similar A-level
scores (composite score for white trainees: M = 354.55, SD = 72.33; for BME trainees:
M = 333.63, SD = 72.08), t(251) = 1.30, p = .20. The two groups did not differ by degree
class, χ²(2) = 1.05, p = .59. Of white trainees, 24% had attended Oxbridge or Russell
Group universities and a further 57% other pre-1992 universities, compared to 23% and
59%, respectively, of BME trainees. 

The mean time between trainees obtaining their Graduate Basis for Chartering (GBC) 
with the British Psychological Society (BPS), either as a result of completing an accredited 
psychology degree or conversion diploma, and beginning the course was 3.3 years 
(SD = 2.0, range 0-16 years). A PhD had been obtained by 5% (n = 15), an MSc by 29% 
(n = 80). 

Applicants had varied relevant clinical experience at the time of application: 42% had 
minimal (n = 116), 40% moderate (n = 110), and 14% substantial (n = 39) experience. 
Eighty-six per cent had worked in the NHS (n = 237) and 64% as research assistants
(n = 176). 

References 

All candidates were required to submit an academic and a clinical reference at application. 
Referees provided a rating alongside their narrative reference, comparing the candidate to
other clinical psychology applicants they had provided references for, using a 5-point 
scale with two anchors (1 = much worse than others; 5 = the best). The large number 
(n = 66) of missing ratings is due to referees not having acted as referee previously and 
thus being unable to compare. The mean rating by both academic and clinical referees was 
4.5 (both SD = 0.6); no trainee was given a rating below three. 

Interviews 

Mean interview ratings were: for part A (academic/theory) = 8.3 (SD = 0.8), for part B
(clinical) = 8.4 (SD = 0.8), and for part C (personal suitability) = 8.9 (SD = 0.8) (the C 
rating was only available for the 84 trainees selected from 2007 onwards). 

Outcome data 

Contemporary outcomes 

Academic performance. The median number of case reports marked 'stipulated' or 
worse was two (interquartile range 1-3; range 0-5). Fifty-eight trainees received major 
revisions (2 months) or a fail for at least one report, including 11 who failed at least one
case report. 

Clinical performance. Twelve trainees failed a placement or provoked serious 
concerns about their performance on placement. 

Research performance. Twenty-one trainees received either 3-month (n = 20) or 
1-year corrections to their theses; none failed their viva. These were combined for further 
analysis and compared to trainees who received a pass or minor revisions in their viva. 

Table 1. Relationships between contemporary academic outcomes assessed by non-parametric
correlations (Kendall's τb)

Interrelationships between contemporary outcomes. The relationships between
contemporary academic outcomes are shown in Table 1. All exam marks were positively
correlated. Higher exam marks were associated with a decreased chance of getting
stipulated revisions or worse on case reports. A Kendall's coefficient of concordance
across all four exam marks and the case report score was calculated. This was statistically
significant, W = 0.22, p < .001, confirming that the variables were related to each other,
but at quite a low level, so they were analysed separately rather than combining them into
a global performance measure. The relationships between contemporary academic and
research outcomes are shown in Table 2.
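
For readers unfamiliar with the coefficient of concordance, the sketch below shows one way W can be computed from several measures taken on the same trainees. It is an illustration only: it omits the correction for tied ranks, the function names and data are hypothetical, and it assumes that all measures are scored in the same direction (a count of poor case reports would need reversing before being combined with exam marks).

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kendalls_w(measures):
    """Kendall's W for m measures, each a list of n scores for the same n trainees."""
    m = len(measures)
    n = len(measures[0])
    ranked = [average_ranks(scores) for scores in measures]
    rank_sums = [sum(r[i] for r in ranked) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical data: three measures that order four trainees identically.
print(kendalls_w([[1, 2, 3, 4], [10, 20, 30, 40], [5, 6, 7, 8]]))
```

W ranges from 0 (no agreement between the rankings) to 1 (identical rankings), so a value of 0.22 indicates that the measures order trainees in only loosely similar ways.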

Poor placement performance was related to poor exam performance, but there was no 
association between poor placement performance and case report marks, which raises 
the question of the extent to which case reports measure clinical knowledge and/or skills. 
While 58 trainees received at least one major revision or fail on their case reports, of the 12
trainees with poor performance on placement, only three were among these 58. Trainees 
who received 3-month or 1-year corrections on their thesis were more likely to have 
received poorer grades in all exams, and in their case reports. 

Table 2. Relationships between contemporary academic and research outcomes

Retrospective outcomes

Course tutor ratings. Tutors raised concerns about the interpersonal skills of 20
trainees across all intakes; about the robustness of 18 trainees; and the critical thinking
ability of 18 trainees. Concerns about interpersonal skills were statistically significantly
related to concerns about robustness (Fisher exact p < .001) and critical thinking (Fisher
exact p = .001), but the latter two were not significantly related (Fisher exact p = .11).
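
The Fisher exact test used for these yes/no ratings can be sketched from first principles using the hypergeometric distribution. This is an illustrative implementation, not the software used in the study; the function name is hypothetical, and the example table is invented because the cross-tabulated counts are not reported here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Hypothetical table: rows = concern A (yes/no), columns = concern B (yes/no).
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))
```

An exact test is a natural choice here because only around 20 trainees attracted each type of concern, so some expected cell counts would be too small for a chi-square approximation.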

Research supervisor ratings. The mean supervisor rating of trainees' research skills was 
3.5 (SD = 1.0). Raters used the full 5-point scale: 4% of the sample was rated as showing 
poor research skills (rating < 2), 13% as showing poor or below average skills 
(rating < 3), and 16% as outstanding (rating = 5). 

Predictors of performance on the DClinPsy course

Multivariate statistics were used generally, but given the small numbers with placement 
concerns and poor thesis performance, these were analysed using univariate tests only. 

Predicting poor clinical performance 

Table 3 shows the predictors of placement concerns/failure. A-level points, course tutor 
concerns in all areas, and research supervisor ratings were associated with poor 
performance. 
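
The U statistics reported for ordinal and continuous predictors come from Mann-Whitney tests comparing trainees with and without concerns. The sketch below shows how a U statistic is obtained by counting pairs. It is an illustration only: the function name and data are hypothetical, no p-value is computed, and statistical packages differ in which of the two possible U values they report.

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs (a, b) with a > b count 1, ties count 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical A-level points for trainees with and without placement concerns.
with_concerns = [280, 300, 320]
without_concerns = [340, 360, 400, 420]
u_concerns = mann_whitney_u(with_concerns, without_concerns)
u_no_concerns = mann_whitney_u(without_concerns, with_concerns)
print(u_concerns, u_no_concerns)
```

The two U values always sum to the product of the group sizes, so either one fully describes the comparison.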

Predicting poor research performance (viva outcome) 

The predictors of research performance are shown in Table 4. Poorer viva outcome 
(3-month or 1-year corrections) was associated with course tutor concern over critical 
thinking, research supervisor ratings, and to a lesser extent with a longer time between 
obtaining GBC and start of training. 

Predicting exam performance 

Multiple regression was used to examine the predictors of exam performance. For each 
exam, an initial model regressed exam performance on to the retrospective ratings. 
A second model regressed exam performance on to the following pre-course variables: 
gender, ethnicity, age, A-level points, school type, degree class, university type, time since 
obtaining GBC, research assistant experience, clinical experience, NHS experience, and 
whether an MSc or PhD had been obtained. Referee ratings were then added to this second
model, and interview ratings were then added to these.
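
The stepwise addition of predictor blocks described above is typically evaluated with an F test on the change in R². The sketch below shows the standard formula. It is an illustration only: the function name is hypothetical, and the R² values in the example are invented, since only adjusted R² values are reported in this paper. The sample size and predictor counts in the example are assumptions chosen to give 2 and 141 degrees of freedom.

```python
def r2_change_f(r2_reduced, r2_full, n, k_full, q):
    """F test for the increase in R-squared when q predictors are added.

    r2_reduced : R-squared of the model before the block is added
    r2_full    : R-squared of the model after the block is added
    n          : number of cases with complete data
    k_full     : number of predictors in the full model
    q          : number of predictors in the added block
    """
    df1 = q
    df2 = n - k_full - 1
    f = ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
    return f, df1, df2


# Hypothetical example: two referee ratings added to 17 pre-course predictors,
# with 161 complete cases (all values are assumptions for illustration).
f, df1, df2 = r2_change_f(r2_reduced=0.18, r2_full=0.22, n=161, k_full=19, q=2)
print(round(f, 2), df1, df2)
```

Because cases with missing referee or interview ratings drop out of the later models, the error degrees of freedom shrink as each block is added.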

Exam 1 - psychological theory, year 1. The retrospective ratings model was statistically
significant, F(5, 256) = 5.1, p < .001, adjusted R² = 7.3%. Only higher research supervisor
ratings were significantly associated with better psychological theory exam marks, see
Table 5. The pre-course variables regression model was also statistically significant,
F(17, 224) = 2.8, p < .001, adjusted R² = 11.4%, with better degree class, higher A-level
points, and not having attended a grammar school associated with higher exam 1 marks,
see Table 5. Adding referee ratings to the pre-course variables resulted in a significant
change in R²: F(2, 141) = 3.7, p = .028, new adjusted R² = 12.2%. In this model, only higher
degree class significantly positively predicted exam 1 marks. Better referee ratings were
associated with lower exam 1 marks, but neither rating was quite statistically significant on
its own (0.1 > ps > 0.05). Finally, adding the interview ratings did not result in a
significant change in R²: F(2, 139) = 2.9, p = .056. Due to the large number of predictors
entered, only those that emerged as significant are shown in Table 5 and the tables that follow; full
details are available on request.

Table 3. Univariate predictors of concerns about placement performance/failure

Variable                          Test statistic       p
Gender                            Fisher exact test    .4
Ethnicity                         Fisher exact test    .090
Age                               U = 1783.5           .3
A-level points                    U = 801              .009**
Degree class                      U = 1618             .6
Time lag since obtaining GBC      U = 1478.5           1.0
Research assistant experience     Fisher exact test    1.0
Clinical experience               U = 1900.5           .1
NHS experience                    Fisher exact test    .4
Completed MSc                     Fisher exact test    .2
Completed PhD                     Fisher exact test    1.0
School type                       Fisher exact test    .8
University type                   Fisher exact test    .3
Referee ratings
  Academic reference              U = 998              .5
  Clinical reference              U = 520.5            .06
Interview ratings
  Academic rating (A)             U = 1507.5           .8
  Clinical rating (B)             U = 1342             .4

Note. NHS = National Health Service; GBC = Graduate Basis for Chartering.
*p < .05; **p < .01; ***p < .001.

Exam 2 - research methods, year 1. The retrospective ratings model was statistically
significant, F(5, 256) = 10.0, p < .001, adjusted R² = 14.7%. Course tutor concerns over
interpersonal skills, research supervisor ratings, and to a lesser extent tutor concerns over
critical thinking were independently associated with research methods exam performance,
see Table 6. The pre-course variables model was also statistically significant,
F(17, 224) = 4.0, p < .001, adjusted R² = 17.5%, with white ethnicity, better degree class,
better A-levels and not attending a post-1992 or non-UK university independently
associated with higher exam 2 marks. Adding the referee ratings did not change the R²
significantly: F(2, 141) = 0.4, p = .7, nor did subsequently adding the interview ratings:
F(2, 139) = 1.1, p = .3. Referee ratings and interview scores did not predict exam 2 marks.

Table 4. Univariate predictors of concerns about research performance as measured by viva outcome

Variable                                            Test statistic       p
Retrospective ratings
  Course tutor concern over interpersonal skills    Fisher exact test    .6
  Course tutor concern over robustness              Fisher exact test    .4
  Course tutor concern over critical thinking       Fisher exact test    .006**
  Research supervisor rating                        t(262) = 3.1         .002**
Pre-course variables
  Gender                                            Fisher exact test    .7
  Ethnicity                                         Fisher exact test    .4
  Age                                               U = 2442.5           .7
  A-level points                                    U = 2166.5           .9
  Degree class                                      U = 2866.5           .3
  Time lag since obtaining GBC                      U = 3118.5           .046*†
  Research assistant experience                     Fisher exact test    .8
  Clinical experience                               U = 3101             .08
  NHS experience                                    Fisher exact test    .7
  Completed MSc                                     Fisher exact test    .3
  Completed PhD                                     Fisher exact test    .8
  School type                                       Fisher exact test    .6
  University type                                   Fisher exact test    .2
Referee ratings
  Academic reference                                U = 1465             .6
  Clinical reference                                U = 1486.5           .9
Interview ratings
  Academic rating (A)                               U = 2155.5           .1
  Clinical rating (B)                               U = 2119.5           .1

Note. NHS = National Health Service; GBC = Graduate Basis for Chartering.
*p < .05; **p < .01; ***p < .001.
†Median time lag for good performance 3 years; for poor performance median time lag 4 years.

Table 5. Multivariate predictors of exam 1 marks

Variable                          B         p
Retrospective ratings
  Research supervisor rating      0.21      .001***
Pre-course variables
  A-level points                  0.002     .015*
  Degree class                    0.41      .001**
  Attending a grammar school      -0.39     .035*
Referee ratings added
  Degree class                    -0.51     .002**

Note. *p < .05; **p < .01; ***p < .001.

Table 6. Multivariate predictors of exam 2 marks

Variable                                            B         p
Retrospective ratings
  Course tutor concern over interpersonal skills    -0.78     .002**
  Course tutor concern over critical thinking       -0.61     .021*
  Research supervisor rating                        0.26      <.001***
Pre-course variables
  Ethnicity (being non-white)                       -0.65     .003**
  A-level points                                    0.002     .021*
  Degree class                                      0.45      <.001**
  Attending a post-1992 university                  -0.41     .028*
  Attending a non-UK university                     -0.76     .029*

Note. *p < .05; **p < .01; ***p < .001.

Table 7. Multivariate predictors of exam 3 marks

Variable                          B         p
Retrospective ratings
  Research supervisor rating      0.16      .01*
Pre-course variables
  A-level points                  0.009     .007**

Note. *p < .05; **p < .01; ***p < .001.

Exam 3 - advanced psychological theory, year 2. The retrospective ratings model was
statistically significant: F(5, 256) = 4.7, p < .001, adjusted R² = 6.5%, and only the research
supervisor rating was significantly associated with advanced psychological theory exam
performance, see Table 7. The pre-course variables model was also statistically
significant, F(17, 224) = 1.8, p = .025, adjusted R² = 5.6%, with only A-level points predicting
exam 3 marks. Adding the referee ratings did not change the R² significantly: F(2, 141) = 0.7,
p = .5, nor did subsequently adding the interview ratings: F(2, 139) = 1.4, p = .3. Referee
ratings and interview scores did not predict exam 3 marks.

Exam 4 - statistics, year 2. The retrospective ratings regression model was statistically
significant, F(5, 256) = 7.7, p < .001, adjusted R² = 11.4%, and only research supervisor
ratings were significantly associated with statistics exam performance, see Table 8. The
pre-course variables model was also statistically significant, F(17, 224) = 4.9, p < .001,
adjusted R² = 21.7%. Younger age, being white, better A-levels, attending Oxbridge, not
attending a post-1992 or non-UK university and less clinical experience predicted higher
statistics exam marks. Adding referee ratings did not change the R² significantly:
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Table 8. Multivariate predictors of exam 4 marks (statistics) 


Variable 


B 


P 


Retrospective ratings 






Research supervisor rating 


0.33 


< 00l *** 


Pre-course variables 






Age 


-0.39 


.033* 


Ethnicity (being non-white) 


-5.0 


.039* 


A-level points 


0.022 


.036* 


Attending Oxbridge 


4.5 


.040* 


Attending a post- 1 992 university 


-4.3 


.035* 


Attending a non-UK university 


8.3 


.030* 


Cliniral pvnpripnrp 


—0.21 


.034* 


Note. *p < .05; **p < .01; ***f> < .001. 






Table 9. Multivariate predictors of case report marks 






Variable 


IRR 


P 


Retrospective ratings 






Research supervisor rating 


0.8 


< 00l *** 



Note. *p < .05; **p < .01; ***f> < .001. 



^2,141 = 0.6, p = .6. Then, adding the interview ratings did not change the R 2 significantly: 
7*2,139 = 0.3, p = .7. Referee ratings and interview scores did not predict exam 4 marks. 

Predicting case report marks 

Multiple Poisson regression was performed to predict the number of stipulated, major, 
and fail marks trainees received for their five case reports, along the same lines as the 
multiple linear regressions used to predict the exam marks. The retrospective ratings 
model was statistically significant: χ²(5) = 19.9, p = .001. Only research supervisor 
ratings predicted case report marks, see Table 9. A multiple regression using the 
pre-course variables was not statistically significant: χ²(17) = 17.6, p = .4. Adding the 
referee ratings did not change the deviance significantly: χ²(2) = 0.7, p = .7. Then, adding 
the interview ratings also did not change the deviance significantly: χ²(2) = 2.1, p = .4. 
Referee ratings and interview scores did not predict case report marks. 
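
An incidence rate ratio (IRR) from a Poisson model is the exponentiated regression coefficient, so it acts multiplicatively on the expected count. The sketch below illustrates how the IRR of 0.8 reported in Table 9 could be read, assuming the research supervisor rating enters the model as a single numeric predictor. The helper names are hypothetical.

```python
import math


def coefficient_from_irr(irr):
    """Poisson regression coefficient (log scale) corresponding to an IRR."""
    return math.log(irr)


def percent_change_in_expected_count(irr, units=1.0):
    """Percentage change in the expected count for a rise of `units` in the predictor."""
    return (irr ** units - 1.0) * 100.0


irr = 0.8  # value reported for the research supervisor rating in Table 9
coef = coefficient_from_irr(irr)
change_one = percent_change_in_expected_count(irr)           # one-point higher rating
change_two = percent_change_in_expected_count(irr, units=2)  # two-point higher rating
print(f"B = {coef:.3f}; change per point = {change_one:.0f}%; per two points = {change_two:.0f}%")
```

Under that assumption, each one-point increase in the supervisor rating corresponds to roughly a fifth fewer stipulated, major, or fail marks across the case reports.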

Poor performance, interview ratings, and pre-training background 

Given that interview ratings are a cornerstone of the selection process, we took a closer 
look at interview ratings for those trainees whose performance in at least one area during 
training was markedly poor. The 12 trainees with placement concerns were rated 
somewhat lower in sections A and B of their interviews (section C ratings were omitted as 
they were only available for one), but none of these differences approached significance. 
In terms of their clinical experience prior to starting the course, 10 of the 12 had worked in 
the NHS; four had minimal, three moderate, and five substantial clinical experience. 

The data on the 11 trainees who failed at least one case report (one of whom failed two 
reports) again showed no marked differences on interview ratings, although their mean 
scores in B and C were somewhat lower than those without failed case reports. Four of 
these 11 had a first class degree, three an MSc, and one a PhD. Their A-level points ranged 
from 260 to 480, and three had at least three A-levels at grade A. 

The 11 trainees rated as showing poor research skills by their supervisors (rating < 2) 
received similar ratings on interview parts B and C, but were rated as poorer on section A 
(M = 7.9, SD = 0.9) than those not rated poorly (M = 8.3, SD = 0.8). Of these 11, six had 
an MSc (again following a 2.1 degree), two a first class degree, and one a PhD (following a 
2.1 degree). Their A-level points ranged from 240 to 360. 

Discussion 

Completion rates in clinical psychology training are very high, with dropout very much 
the exception. This study set out to identify whether selection ratings and applicant 
characteristics predict performance during clinical psychology training. In considering 
the results, it should be borne in mind that in view of the high applicant to place ratio 
(average 28:1), the data presented here are very positively skewed as they only pertain to 
those successful in gaining a place; other than for A-level results, data variance was 
relatively small. It was not possible to make any corrections for range restriction. The 
actual predictive validity of the selection variables considered is probably higher than the 
observed relationships. The highly selective nature of the course makes it harder to see 
relationships between selection variables and outcomes, effectively reducing power. 
More research is needed to address this issue. 
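
To illustrate why range restriction attenuates observed relationships, the sketch below applies one widely used correction for direct range restriction, Thorndike's Case II formula. It requires the predictor's standard deviation in the full applicant pool, which was not available here, so the function name and every number are hypothetical and serve only to show the direction and rough size of the effect.

```python
import math


def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Thorndike Case II estimate of the correlation in the unrestricted group.

    r_restricted is the correlation observed among selected candidates;
    the two SDs describe the predictor before and after selection.
    """
    u = sd_unrestricted / sd_restricted
    return (r_restricted * u) / math.sqrt(1.0 + r_restricted ** 2 * (u ** 2 - 1.0))


# Hypothetical: observed r = .20, predictor SD halved by selection
r_corrected = correct_for_range_restriction(0.20, sd_unrestricted=2.0, sd_restricted=1.0)
print(f"corrected r = {r_corrected:.2f}")
```

When the two standard deviations are equal, the formula returns the observed correlation unchanged; the more selection narrows the predictor's spread, the larger the upward correction.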

The key findings can be summed up as follows: generally, performance on one part of 
the course was correlated with performance on other parts of the course, with exam 
results showing statistically significant small to medium correlations with clinical 
placement concerns, viva outcome, and case report marks. However, against expectations, 
case reports correlated with academic, not clinical, performance, raising questions 
about the validity of case reports as indicators of clinical performance (cf. Simpson et al., 
2010). From all the information available at selection, school leaving exam grades 
(A-levels) were the most important predictor of performance during training; as noted, 
they were also the only data that showed a reasonable range. They predicted marks on all 
four of the exams independently of other pre-course variables, and were univariately 
associated with clinical placement problems. This corroborates evidence from medicine 
where A-levels have been found to predict academic performance many years after 
graduation (McManus, Smithers, Partridge, Keeling, & Fleming, 2003). While caution has 
been urged about the use of A-levels in selection, given that they are influenced by social 
and educational advantage (Scior et al., 2007), in this study A-levels had a clear role in 
predicting performance. Although there was less variance in degree scores than in A-level 
scores, degree performance also predicted exam performance independently from 
A-levels on year 1 but not year 2 exams. University type was predictive of performance on 
the year 1 research methods exam and the year 2 statistics exam, with students who 
attended a post-1992 institution or a university outside the United Kingdom performing 
worse, and Oxbridge students performing better, on the statistics exam. 

Demographic factors were also predictive: non-white students performed worse in the 
year 1 research methods exam and the year 2 statistics exam. Age also independently 
predicted the statistics exam. Trainees were relatively diverse in terms of age, but only a 
small proportion were males (15%) and an even smaller proportion (9%) were from BME 
backgrounds. While the proportion of BME trainees compares fairly well with the 10% of 
people from BME backgrounds nationally (Office for National Statistics, 2011), as a 
London based course it compares poorly with the 34% of the Greater London population 
(Greater London Authority, 2011). The relationship observed in this sample between 
ethnicity and performance raises the concern that the underperformance of non-white 
students seen in medical education (Woolf et al., 2011) and more broadly in higher 
education (Richardson, 2008) may also be seen in clinical psychology. This warrants 
further investigation, particularly as the UK Equality Act 2010 places a duty on all public 
authorities, including universities and the NHS, to monitor admission and progress of 
students by ethnic group to be able to address inequalities or disadvantage. 

The finding that those with more clinical experience did worse in the statistics exam 
than those with minimal clinical experience is likely to be because the factors that 
impeded their exam performance also caused them to take longer to gain a training place, 
and thus they had more time to gain clinical experience. 

Retrospective research supervisor ratings correlated with all outcomes and predicted 
case report marks and three of the four exam marks suggesting that these may measure 
global course ability, rather than just research skills. In contrast, contemporary interview 
ratings and referee ratings were not generally predictive of performance. The exception 
was a marginal relationship with one of the year 1 exams. The results here were 
complicated: when we added the references to the regression model, both the academic 
and clinical reference were negative predictors; this may be a spurious relationship, 
however. None of the information available at selection predicted case report 
performance in our multivariate analyses. 

The demographics of the trainee cohorts studied were very similar to the national 
picture of an average female:male ratio of 8.5:1 and a white:BME ratio of 9:1 (Clearing 
House data for 2002-2008). It is not possible to directly compare the academic 
qualifications of the present cohorts to the national picture as national data for the relevant 
period only records undergraduate results for those without post-graduate qualifications. 
However, given the high applicant:place ratio, those with first class degrees (40% across 
the cohorts studied) may be overrepresented (21.7% nationally had a first class degree, but 
many of the 25.4% of trainees nationally with Masters and PhD qualifications may have also 
had a first). This suggests that the findings are of relevance to other training providers in 
the United Kingdom. Overall, in view of our data, trainers and selectors should consider 
paying attention to A-levels and to some extent degree mark and university type in 
reaching decisions about who is likely to perform well on clinical psychology courses. 
This may seem a very unwelcome conclusion and raises ethical concerns. The desire to 
select students who are likely to perform well during training must be weighed 
carefully against the desire to select a diverse student body and profession. Furthermore, 
evidence that individuals from BME backgrounds are over-represented in less highly 
regarded universities (Shiner & Modood, 2002; Turpin & Fensom, 2004) suggests that 
increased attention to applicants' academic history may run counter to attempts to widen 
access to the profession. One message does emerge clearly though from the findings: 
while references and interviews clearly play a crucial role in selection, they do not appear 
to predict actual performance during clinical training. This may well be because they help 
deselect unsuitable applicants, thus reducing the trainee body to individuals likely to 
broadly perform well, as suggested by very low drop-out and failure rates. However, those 
involved in reaching selection decisions may well wish to reconsider what relative 
importance they attach to the range of information available about applicants seen for 
interview. 



Limitations 

The study findings relate only to one course, albeit the largest in the United Kingdom. 
Given that selection processes and criteria vary across courses, findings may not 
generalize to other settings. Furthermore, this was an exploratory study and we performed 
a large number of analyses; the possibility of type 1 error should be borne in mind. 
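
The concern about type 1 error can be made concrete with a short calculation. The sketch below assumes independent tests, which is a simplification because the outcomes in this study were correlated, and uses a hypothetical number of tests rather than the actual count of analyses performed.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that caps the familywise rate at alpha."""
    return alpha / n_tests


n_tests = 20  # hypothetical count, for illustration only
fwer = familywise_error_rate(0.05, n_tests)
threshold = bonferroni_threshold(0.05, n_tests)
print(f"familywise error rate = {fwer:.2f}; Bonferroni threshold = {threshold:.4f}")
```

A Bonferroni threshold is conservative; it is shown here only as the simplest way of seeing how quickly the chance of a spurious finding grows with the number of tests.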

Our analyses relied on quantitative indicators that could be accessed. Many other 
variables, not least personality factors and life events, may well contribute to performance 
during training but were not measured here. Furthermore, many of the analyses relied on 
retrospective ratings, which may be unreliable. Research supervisors generally had fairly 
extensive contact with trainees under their supervision and made use of the full 5-point 
scale, indicating that they were able to recall trainees' performance fairly well. However, 
these ratings suffer from the usual limitations of subjective ratings; due to the one-to-one 
supervisor-trainee relationship, it was not possible to assess inter-rater reliability. Course 
tutor ratings should be viewed with caution; due to the time lag involved, their reliability is 
questionable. It may be advisable for courses to collect contemporary ratings of trainee 
performance beyond academic and placement indicators, and to test their reliability and 
usefulness in monitoring trainee progress. The significant limitations of the data we had to 
rely on suggest that the reliability and validity of performance indicators commonly used 
during clinical psychology training merit careful consideration. Future research of this 
type should aim to use prospective data and more robust measures. 

Conclusions 

We want our selection methods to be as fair a way as possible of selecting among 
candidates who have all passed two previous stages of selection and who present 
relatively similar achievements to date. Our hope that interviews, and the applicants' 
references, would predict performance was disappointed. We presume that the 
interview process screens out unsuitable candidates given that drop-out and failure are 
very much the exception. The range of scores of both references and interview ratings 
was small for accepted applicants, so it is unclear whether the finding that they did not 
predict performance is to do with inadequate variance in the data, poor predictive power 
of the judgements which give rise to the ratings (cf. Stanton & Stephens, 2012), range 
restriction, or perhaps because interviews elicit valuable information which is nevertheless 
not then summatively evaluated during training. 

Dropout and failure rates were very low, with all but two trainees who dropped out 
early completing their training successfully, even where certain aspects of training had to 
be repeated due to initial failure. Only 4% failed a case report, 1% a placement, and none 
failed their thesis viva. Further research is needed to understand how performance on the 
course relates to practice over the longer term as a clinical psychologist, and whether all 
those who complete training are indeed fit to practise. In medicine, it is commonly 
asserted that being a 'good' doctor is not (just) about performance in exams, and that other 
factors that are harder to measure and changeable make the difference between good, 
poor and average doctors (British Medical Journal, 2002). The same is probably true of 
clinical psychologists. With so many strong students applying for so few places, it is worth 
asking whether any selection methods can choose which students will perform best as 
clinical psychologists after they qualify. Would a lottery system choosing among all 
students judged to meet entry criteria be fairer to applicants, trainees, and ultimately to 
service users (cf. Simpson, 1975)? Or are current attempts to develop a national screening 
test a move in the right direction? 